A L2 Discrepancy Learning Process with Applications to Outlier and Insider Detections With Large High-Dimensional Data
Author
Abstract
This paper presents a discrepancy-based framework for outlier and insider detection. Given any sequence of profiles, a local discrepancy first identifies regions where the profiles are clumped or scarce; a global L2 discrepancy then summarizes the overall distribution pattern of the data in a single real value. An L2 discrepancy learning process is formulated to rank each profile in the sequence by optimizing the L2 discrepancy value. This learning process gives access to many levels of information about the outliers and insiders in the data. Experimental results on data sets with different features demonstrate the application of the L2 discrepancy learning process and show that the algorithm efficiently detects the outliers and insiders in the data.
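The abstract does not give the discrepancy formula or the ranking rule, so the following is only a minimal sketch under stated assumptions. It assumes the profiles have been scaled into the unit cube [0, 1]^d and uses Warnock's closed form for the squared L2 star discrepancy as the global measure. The leave-one-out scoring in `rank_by_leave_one_out` is a hypothetical stand-in for the paper's learning process, not the authors' algorithm, and both function names are illustrative.

```python
from math import prod


def l2_star_discrepancy_sq(points):
    """Squared L2 star discrepancy of points in [0, 1]^d (Warnock's formula)."""
    n = len(points)
    d = len(points[0])
    # Single-sum term: product over coordinates of (1 - x_k^2) for each point.
    single = sum(prod(1.0 - x * x for x in p) for p in points)
    # Double-sum term: product over coordinates of (1 - max(x_k, y_k)) for each pair.
    pair = sum(
        prod(1.0 - max(a, b) for a, b in zip(p, q))
        for p in points
        for q in points
    )
    return 3.0 ** -d - (2.0 ** (1 - d) / n) * single + pair / n ** 2


def rank_by_leave_one_out(points):
    """Assumed ranking rule: order profiles by how much removing each one
    lowers the global discrepancy (largest reduction first)."""
    base = l2_star_discrepancy_sq(points)
    scores = [
        base - l2_star_discrepancy_sq(points[:i] + points[i + 1:])
        for i in range(len(points))
    ]
    return sorted(range(len(points)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Toy data: three profiles in a tight clump, one near the opposite corner.
    profiles = [[0.10, 0.12], [0.11, 0.10], [0.12, 0.11], [0.95, 0.90]]
    print(l2_star_discrepancy_sq(profiles))
    print(rank_by_leave_one_out(profiles))
```

As a sanity check on the formula, a single point at 0.5 in one dimension gives a squared discrepancy of 1/12, which matches the direct integral of the squared local discrepancy. The pairwise term makes each evaluation quadratic in the number of profiles, so a leave-one-out pass as written is cubic; large data sets would need incremental updates or sampling.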
Similar resources
Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data
Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...
Outlier detection for high dimensional data pdf
Is particularly useful for high dimensional data where outliers cannot be found. High dimensional data in Euclidean space pose special challenges to data. In about just the last few years, the task of unsupervised outlier detection has found. Outlier detection is an outstanding data mining task referred to...
Outlier Detection in Random Subspaces over Data Streams: An Approach for Insider Threat Detection
Insider threat detection is an emergent concern for industries and governments due to the growing number of attacks in recent years. Several Machine Learning (ML) approaches have been developed to detect insider threats, however, they still suffer from a high number of false alarms. None of those approaches addressed the insider threat problem from the perspective of stream mining data where a ...
Image alignment via kernelized feature learning
Machine learning is an application of artificial intelligence that is able to automatically learn and improve from experience without being explicitly programmed. The primary assumption for most of the machine learning algorithms is that the training set (source domain) and the test set (target domain) follow from the same probability distribution. However, in most of the real-world application...
On the Interplay of Self-Esteem, Proficiency Level, and Language Learning Strategies Among Iranian L2 Learners
It is axiomatic that L2 teaching and learning is a process that requires dynamic involvement of L2 learners in the acquisition of knowledge and skills. L2 learners need to be assisted in setting individual learning goals. They should also be given the exposure to and guidance in effective language learning strategies (LLSs) in order to build a high level of confidence in the learning process. T...